Abstract

Mutations in CUL7, OBSL1 and CCDC8, leading to disordered ubiquitination, cause one of
the commonest primordial growth disorders, 3-M syndrome. This condition is associated
with i) abnormal p53 function, ii) GH and/or IGF1 resistance, which may relate to failure to
recycle signalling molecules, and iii) cellular IGF2 deficiency. However, the exact molecular
mechanisms that may link these abnormalities generating growth restriction remain
undefined. In this study, we have used immunoprecipitation/mass spectrometry and
transcriptomic studies to generate a 3-M 'interactome', to define key cellular pathways and
biological functions associated with growth failure seen in 3-M. We identified 189 proteins
which interacted with CUL7, OBSL1 and CCDC8, from which a network including 176 of these
proteins was generated. To strengthen the association to 3-M syndrome, these proteins were
compared with an inferred network generated from the genes that were differentially
expressed in 3-M fibroblasts compared with controls. This resulted in a final 3-M network of
131 proteins, with the most significant biological pathway within the network being mRNA
splicing/processing. We have shown using an exogenous insulin receptor (INSR) minigene
system that alternative splicing of exon 11 is significantly changed in HEK293 cells with
altered expression of CUL7, OBSL1 and CCDC8 and in 3-M fibroblasts. The net result is a
reduction in the expression of the mitogenic INSR isoform in 3-M syndrome. From these
preliminary data, we hypothesise that disordered ubiquitination could result in aberrant
mRNA splicing in 3-M; however, further investigation is required to determine whether this
contributes to growth failure.
Introduction

Primordial short stature (PSS) is characterised by severe
pre- and postnatal growth restriction resulting in significant
short stature. There are a number of genetic
syndromes that result in PSS, including the classical
disorders Seckel syndrome, Meier-Gorlin syndrome and
microcephalic osteodysplastic short stature types I and II
(MOPD I and II) as well as the commoner normocephalic
(NPSS) syndromes, 3-M and Silver-Russell syndrome (SRS)
(Eggermann 2010, Clayton et al. 2012).

Over the past decade, genetic causes for these different 
PSS conditions have been successfully identified with 
the predicted functions of these summarised in Table 1. 
The importance of these pathways extends beyond growth 
as they also underpin other developmental processes that 
are associated with metabolic disease, cancer and ageing. 
We have extensively investigated the genetic aetiology 
of 3-M syndrome as a model of NPSS. Unlike many other 
PSS conditions, the 3-M syndrome phenotype is almost 
exclusively growth related with severe pre and postnatal 
growth restriction but no other significant system disorder 
(Hanson et al. 2011a). We have previously identified that 
mutations in three different genes CUL7, OBSL1 and
CCDC8 cause 3-M syndrome (Huber et al. 2005, Hanson
et al. 2009, 2011b, 2012). CUL7 forms the central
component of an SCF E3 ubiquitin ligase (Dias et al.
2002) that localises to the Golgi apparatus (Litterman et al.
2011) and has been shown to be involved in the
proteasomal degradation of IRS1 (Xu et al. 2008) and
cyclin D1 (Okabe et al. 2006). Despite numerous investigations,
so far additional targets of CUL7-mediated
ubiquitination have remained elusive. However, it has
been proposed that CUL7 may have a role in the
degradation of many other proteins via its interaction
with CUL1 in the formation of an ubiquitinating
CUL1/CUL7 heterocomplex (Tsunematsu et al. 2006).
OBSL1 on the other hand is a postulated cytoskeletal
adaptor protein that is required for CUL7 localisation and
has been implicated in the regulation of Golgi morphogenesis
in neural dendrites (Litterman et al. 2011). Both
CUL7 and CCDC8 are known interacting proteins of
p53, acting as co-factors in p53-mediated apoptosis (Kim
et al. 2007, Dai et al. 2011). There is little apparent
similarity between the three proteins; however, the
near identical phenotype of 3-M syndrome patients
regardless of mutation type and the fact that OBSL1
co-immunoprecipitates with CUL7 and CCDC8 (Hanson
et al. 2011b) has suggested a common biochemical
pathway. In terms of the clinical and biochemical
phenotype of 3-M syndrome, we have demonstrated that
i) 3-M children with mutations in CUL7 are significantly
shorter than those with either OBSL1 or CCDC8 mutations
(Hanson et al. 2012), ii) there is clinical evidence of GH
and/or IGF1 resistance (Hanson et al. 2012), iii) associated
with this, growth factor signalling in ex vivo 3-M fibroblast
cells is disrupted (Hanson et al. 2012), and iv) IGF2
expression and IGF2 secreted from 3-M fibroblasts is very
low (Murray et al. 2013).

The mechanisms that link these observations are not 
defined, and therefore we have taken a 'systems' approach 
to elucidate the proteins/genes that may be implicated in 
the 3-M syndrome pathway. Protein-protein interactions 
can be mapped to create networks and in recent years 
larger-scale experimental workflows have been used to 
discover the physical interactions between different 
proteins allowing ever more complex interactome network
models (Cho et al. 2004). These can range from
whole organism to disease-specific interactomes (Gandhi 
et al. 2006, Lim et al. 2006). Known protein-protein 
interactions are often compiled into various databases, 
including Search Tool for the Retrieval of Interacting 
Genes/Proteins (STRING) (Franceschini et al. 2013) and 
Biological General Repository for Interaction Datasets 
(BioGRID) (Chatr-Aryamontri et al. 2013) and these 
along with experimental data can facilitate the mapping 
of biological networks. 

In this study, we have used proteomic and transcriptomic
approaches to identify the putative interacting
partners of CUL7, OBSL1 and CCDC8 to create a 3-M



Table 1 Summary of the genetic causes of primordial short stature disorders

| Primordial short stature condition | Genetic causes | Postulated function |
|---|---|---|
| Normocephalic | | |
| 3-M syndrome | CUL7, OBSL1, CCDC8 | Cullin E3 ubiquitin ligase which targets IRS1 and cyclin D1 for proteasomal degradation |
| Silver-Russell syndrome | 11p15 H19/IGF2 hypomethylation, maternal UPD7 | Imprinting defects which affect expression of the foetal growth factor IGF2 |
| Microcephalic | | |
| Seckel syndrome | ATR, ATRIP, CENPJ, CEP152 | DNA damage response and centriole biogenesis |
| Meier-Gorlin syndrome | ORC1, ORC4, ORC6, CDT1, CDC6 | DNA replication complex |
| MOPD I | RNU4ATAC | Minor spliceosome |
| MOPD II | PCNT | Centrosome and DNA damage response |

syndrome interactome. These interactions have allowed
us to identify key pathways and biological functions in 
3-M syndrome. We have tested the impact of the most 
significant pathway, namely mRNA splicing, on cellular 
function. 



Materials and methods 

Ethics statement 

Skin fibroblasts derived from 3-M syndrome patients and 
appropriate control individuals were used in this study. 
Institutional ethical approval (Central Manchester Local 
Research Ethics Committee 06/Q1407/21) was granted 
and informed written consent was obtained from all 
patients and control subjects. Details of samples used 
have been described previously (Hanson et al. 2012, 
Murray et al. 2013).

Immunoprecipitation 

HEK293 cells were obtained from HPA culture collection 
and grown under normal growth conditions in DMEM 
supplemented with 10% foetal bovine serum. The cells 
were transfected using Effectene transfection reagent (Qiagen)
following the manufacturer's protocol with
plasmids expressing either CUL7, V5-OBSL1 or CCDC8,
which have been described elsewhere (Hanson et al.
2011b). For each of the CUL7-HEK293, V5-OBSL1-HEK293
and CCDC8-HEK293 immunoprecipitation (IP)
experiments, transfected HEK293 cells from six 150 mm
culture dishes were lysed in ice-cold IP buffer (Pierce,
Rockford, IL, USA) with protease inhibitor (Sigma) 24 h
post-transfection. Protein complexes were immunoprecipitated
with 5 µg of either CUL7, V5 (for OBSL1) or CCDC8
specific antibodies (Sigma; AbD Serotec, Oxford, UK;
Novus Biologicals, Cambridge, UK) and collected using
100 µl of protein G Dynabeads (Invitrogen) following the
manufacturer's recommended protocol. After washing three
times in 800 µl of ice-cold IP buffer and a further two times
in ice-cold PBS to remove unbound proteins, the immunocomplexes
were eluted from the beads by boiling in
60 µl SDS sample buffer before being separated by SDS-PAGE.

Furthermore, transfected HEK293 cells (one set of 
each of CUL7-HEK293, V5-OBSL1-HEK293 and CCDC8- 
HEK293) were immunoprecipitated in the same way, each 
from six 150 mm cell culture dishes except no antibody was 
used for the IP stage. The three samples of no antibody 
control IP were generated to serve as the background 
negative controls for mass spectrometry (MS) analysis. 



The CUL7-HEK293, V5-OBSL1-HEK293 and CCDC8- 
HEK293 IP samples and the three background negative 
control IP samples were separated by SDS-PAGE. 
Following coomassie blue staining, gel lanes were cut 
into small slices (approximately ten 1 mm"^ slices for each 
lane). The gel slices were dehydrated by acetonitrile
(ACN), rehydrated in reduction buffer (10 mM dithiothreitol,
25 mM NH4HCO3), alkylated (55 mM iodoacetamide,
25 mM NH4HCO3) and then digested with
sequencing grade trypsin (Promega). The peptides were
extracted from the gel slices once with 20 mM NH4HCO3
and then twice with 5% (v/v) formic acid in 50% (v/v)
ACN. Samples concentrated to 20 µl were then ready for
analysis by GeLC-MS/MS. GeLC-MS/MS analysis of the
digested gel slices was carried out as described previously
(Humphries et al. 2009).

Confirmatory IPs were carried out using transfected 
HEK293 cells (transfected with either CUL7, V5-OBSL1 or
CCDC8 plasmids as described previously) from a single 
150 mm culture dish and processed in the same manner 
as described earlier, using specific antibodies to CUL7, 
V5, CCDC8 or with no antibody as negative control IPs. 
Samples were separated by SDS-PAGE and immuno- 
blotted with specific antibodies to CUL7, V5, CCDC8, 
HNRNPU (Santa-Cruz Biotechnology, Dallas, TX, USA), 
TP53 (Santa-Cruz Biotechnology), CCT2 (Cell Signaling, 
Danvers, MA, USA), XRCC5 (Cell Signaling) and CDK1
(Cell Signaling). 

Data analysis 

MS data cleaning To reduce the likelihood of false- 
positive results within each of the IP/MS datasets, we 
undertook a number of measures including removing any 
proteins from the datasets that only had one matching 
peptide sequence from MS. We conducted three separate 
control IPs with no antibody to remove proteins that bound 
non-specifically to the Dynabeads used in the IP process.
Proteins that were present in any of these three no antibody
control IPs were subsequently removed from the CUL7,
OBSL1 and CCDC8 IP/MS datasets (if present) to provide a
stringent putative interacting protein list for each IP. 
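
The two cleaning rules described above (more than one matching peptide, and absence from every no-antibody control IP) can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the function name `clean_ip_dataset`, the input layout and the toy protein labels are all assumptions.

```python
from typing import Dict, Iterable, Set


def clean_ip_dataset(
    peptide_counts: Dict[str, int],
    control_ips: Iterable[Set[str]],
    min_peptides: int = 2,
) -> Set[str]:
    """Keep proteins with at least `min_peptides` matching peptides
    that were not detected in any of the no-antibody control IPs."""
    background: Set[str] = set()
    for control in control_ips:
        background |= control  # pool proteins seen in any control IP
    return {
        protein
        for protein, n_peptides in peptide_counts.items()
        if n_peptides >= min_peptides and protein not in background
    }


# Toy input: labels and counts are invented for illustration only.
toy_counts = {"PROT_A": 5, "PROT_B": 3, "PROT_C": 7, "PROT_D": 1}
toy_controls = [{"PROT_C"}, set(), {"PROT_E"}]
cleaned = clean_ip_dataset(toy_counts, toy_controls)
# PROT_C is dropped as bead background, PROT_D for having a single peptide.
```

Pooling the three control IPs before filtering mirrors the rule that a protein seen in any one control is removed.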

Cytoscape analysis After removal of background
interactions, to improve the stringency of the IP/MS data
and because CUL7, OBSL1 and CCDC8 had previously
been shown to be the components of a common
biochemical complex (Litterman et al. 2011), suggesting
they would share the majority of the same interacting
partners, we identified only those proteins that were
present in all three IP experiments by computing the
intersection of the CUL7 IP/MS, OBSL1 IP/MS and CCDC8
IP/MS datasets for further analysis. The BioGRID database
of interactions (build 3.1.103) was used to construct an
'IP/MS network' of known interactions between the
proteins that were common to the CUL7, OBSL1 and
CCDC8 IPs and this was visualised using Cytoscape (v2.8).
In tandem, we also identified gene probes that were
differentially expressed between 3-M syndrome (n = 4) and
control fibroblast cells (n = 3). RNA gene expression was
assessed by Affymetrix microarray (HU-133 plus 2.0 chip)
and Robust Multi-Array (RMA) analysis was used to
normalise the microarray data to generate an expression
level for each probe. The dataset and samples used have
been described previously (Murray et al. 2013). For this
analysis, the probes were determined to be differentially
expressed if the fold change difference between 3-M and
control was ±2. The resulting dataset of 913 probes
(which corresponded to 683 distinct genes) was used to
generate an inferred protein-protein interaction model
using BioGRID, the 'transcriptomic network'. To improve
the robustness of the IP/MS network, we took the
intersection between the IP/MS and transcriptomic
networks to generate a multi-omic '3-M interactome'.
Therefore, the 3-M interactome contained only proteins
that were identified to be interacting with CUL7,
OBSL1 and CCDC8 and which were also shown to be
associated with differential gene expression in fibroblast
cells from 3-M syndrome patients compared with normal
healthy controls.
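
The set operations in this step can be illustrated with a short Python sketch. It is not the authors' code: the function names, the toy protein labels and the plain list of interaction pairs (standing in for a BioGRID export) are assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]


def common_proteins(*datasets: Set[str]) -> Set[str]:
    """Proteins present in every dataset (set intersection)."""
    if not datasets:
        return set()
    return set.intersection(*datasets)


def induced_network(proteins: Set[str], interactions: Iterable[Edge]):
    """Keep known interactions whose two partners are both in `proteins`.

    Returns (nodes, edges). Proteins with no recorded partner inside
    the set do not appear as nodes.
    """
    edges = {
        tuple(sorted((a, b)))  # treat A-B and B-A as the same edge
        for a, b in interactions
        if a != b and a in proteins and b in proteins
    }
    nodes = {protein for edge in edges for protein in edge}
    return nodes, edges


# Toy data: three IP/MS protein lists and a small interaction table.
cul7_ip = {"P1", "P2", "P3", "P4"}
obsl1_ip = {"P1", "P2", "P3", "P5"}
ccdc8_ip = {"P1", "P2", "P3", "P6"}
shared = common_proteins(cul7_ip, obsl1_ip, ccdc8_ip)

known = [("P1", "P2"), ("P2", "P1"), ("P3", "P9"), ("P1", "P1")]
nodes, edges = induced_network(shared, known)
```

In this toy case P3 is common to all three IPs but has no recorded partner within the shared set, so it is absent from the network. The same effect is one plausible reason a network can contain fewer proteins than the list it was built from.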

We next used the Reactome database (Croft et al. 
2011) and Webgestalt Pathway Commons (Wang et al. 
2013) to characterise the cellular functions of the putative 
interacting proteins and identify over-represented biological
pathways within the overall 3-M interactome. We used
hypergeometric testing to determine whether the number 
of genes associated with each pathway identified was 
greater than would be expected by chance. We selected a 
small number of proteins from the pathways identified 
within the 3-M interactome, for which antibodies were 
available, for further IP experiments in order to confirm 
the interactions with CUL7, OBSL1 and CCDC8.
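
As a worked illustration of hypergeometric testing for pathway over-representation, the sketch below computes the probability of seeing k or more pathway genes in a query list by chance. The function name and argument names are hypothetical, and the background population size used in the usage lines is an assumption; the source does not state the value used by Reactome or Webgestalt.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric.

    N: genes in the background population
    K: background genes annotated to the pathway
    n: genes in the query list
    k: query genes annotated to the pathway
    """
    upper = min(n, K)  # cannot observe more hits than either limit
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Small check case: 10 genes, 4 in the pathway, 3 drawn, at least 2 hits.
p_small = hypergeom_enrichment_p(k=2, n=3, K=4, N=10)

# Illustrative call with counts of the kind reported for a pathway table;
# the background of 20000 genes is an assumed value, not from the source.
p_example = hypergeom_enrichment_p(k=17, n=131, K=112, N=20000)
```

Exact integer arithmetic with `math.comb` avoids the rounding problems that arise when very small tail probabilities are built from floating-point factorials.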

Key network nodes can be identified through the 
analysis of network properties including connectedness 
and centrality. We used the ModuLand Cytoscape plugin
to analyse the network properties of the 3-M interactome
and generate clusters (or modules) represented by key
network nodes. The function of each central node
best predicts the function of the module it represents
(Szalay-Beko et al. 2012). The central nodes are also
likely to represent the key functional elements of the 
overall network and therefore can be used to prioritise 
future work. 
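
To make the idea of ranking nodes by a network property concrete, the sketch below computes degree centrality on a toy edge list. This is deliberately a simpler stand-in: ModuLand's community centrality is a different and more involved measure, and nothing here reproduces it. All names and the toy edges are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def degree_centrality(edges: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Fraction of the other nodes that each node is directly connected to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbours[a].add(b)
            neighbours[b].add(a)
    n_nodes = len(neighbours)
    if n_nodes < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n_nodes - 1) for node, adj in neighbours.items()}


def top_nodes(edges: Iterable[Tuple[str, str]], k: int = 3) -> List[str]:
    """The k most connected nodes, ties broken alphabetically."""
    scores = degree_centrality(list(edges))
    return sorted(scores, key=lambda node: (-scores[node], node))[:k]


# Toy network: HUB1 has three partners, HUB2 has two, the rest have one.
toy_edges = [
    ("HUB1", "N1"),
    ("HUB1", "N2"),
    ("HUB1", "HUB2"),
    ("HUB2", "N3"),
]
scores = degree_centrality(toy_edges)
ranked = top_nodes(toy_edges, k=2)
```

Degree centrality only counts direct partners, whereas module-based methods also weigh how a node sits within overlapping communities, so the two rankings need not agree.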

Insulin receptor minigene construct An insulin 
receptor (INSR) minigene plasmid was kindly provided as 
a gift by Dr Nicholas Webster, University of California 
San Diego. The minigene contains 110 nucleotides of 
exon 10, 2.2 kb of intron 10, 36 nucleotides of exon 11, 
372 nucleotides of intron 11 and 103 nucleotides of 
exon 12. Intron 11 is a large 7.4 kb intron, but only ~180
nucleotides were cloned at both the 5' and 3' ends
(Talukdar et al. 2011). The INSR minigene spans a region 
of alternative splicing, where inclusion of exon 11 
gives rise to IR-B isoform and exclusion of exon 11 to 
IR-A isoform. 

Cell culture and transfections For the INSR minigene
assay, we used HEK293 cells and skin fibroblasts
derived from 3-M syndrome patients and appropriate
control individuals. Both cell types were maintained in
DMEM supplemented with 10% FBS and grown at 37 °C
at 5% CO2. HEK293 cells were transfected as previously
described, with either INSR minigene alone or with each
3-M gene plus INSR minigene. Skin fibroblast cells
(controls and cells from 3-M syndrome patients with
either CUL7, OBSL1 or CCDC8 null mutations, as
described previously (Hanson et al. 2012)) were transfected
with INSR minigene alone.

RNA extraction and amplification of cDNA The
cells were harvested 24 h after transfection and total RNA
was extracted using PureLink RNA mini kit (Life Technologies)
following manufacturer's protocol. Contaminating
genomic DNA was removed by DNase I treatment
and cDNA generated following manufacturer's protocol
(High capacity RNA to cDNA kit, Life Technologies). INSR
minigene transcripts were amplified by plasmid-specific
primers as described previously (Kosaki et al. 1998) and
PCR products visualised on 4% agarose gels. Relative levels
of IR-B and IR-A were assessed by gel densitometry using
ImageJ software.
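
The quantification step can be sketched as a ratio of band intensities followed by a mean and standard error across replicate transfections. This is an assumed workflow for illustration: the function names and the intensity values are invented, and the source does not describe how densitometry values were processed beyond the IR-B/IR-A ratio.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def isoform_ratio(ir_b: float, ir_a: float) -> float:
    """Relative expression of IR-B to IR-A from two band intensities."""
    if ir_a <= 0:
        raise ValueError("IR-A band intensity must be positive")
    return ir_b / ir_a


def mean_and_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean (sample s.d. / sqrt(n))."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for an s.e.m.")
    return mean(values), stdev(values) / sqrt(len(values))


# Toy densitometry readings (IR-B, IR-A) from three hypothetical lanes.
toy_lanes = [(2.0, 1.0), (3.0, 1.0), (4.0, 1.0)]
ratios = [isoform_ratio(b, a) for b, a in toy_lanes]
avg, sem = mean_and_sem(ratios)
```

Taking the ratio within each lane before averaging normalises for lane-to-lane differences in loading and transfection efficiency.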

Results 

IP/MS of CUL7, OBSL1 and CCDC8 immunocomplexes

The immunopurified protein complexes from HEK293 
cells exogenously expressing either V5 tagged OBSL1,
untagged CUL7 or untagged CCDC8 were analysed by
in-gel liquid chromatography/tandem MS (GeLC-MS/MS)
to identify the proteins binding to OBSL1, CUL7 and
CCDC8. To decrease the likelihood of false-positive
results, we selected only those proteins with multiple
peptide matches present in the GeLC-MS/MS for inclusion
in our network analysis. We identified a total of 49
proteins (Supplementary Table 1, see section on supplementary
data given at the end of this article) that were
present in the MS analysis of three independent negative 
control IPs (background IP with no antibody) and these 
were removed from each of the experimental datasets 
as false positives. 

Within the resulting IP/MS datasets, we identified 618 
putative CUL7-interacting proteins, 593 putative OBSL1-interacting
proteins and 534 putative CCDC8-interacting
proteins. There was a high degree of overlap between each
of these datasets with 189 putative interacting proteins
that were identified as common components in all three
of the IP/MS experiments (Supplementary Table 1).

Network analysis

To determine the likely molecular functions of the 3-M
syndrome pathway and the putative interacting proteins,
we used the BioGRID Cytoscape plugin to create and
visualise protein-protein interaction network models
using the IP/MS data. Using the BioGRID database (build
3.1.103), these putative interacting proteins created a network
of 176 proteins with 1031 connections between them,
which we have termed the 'IP/MS network' (Supplementary
Figure 1A, see section on supplementary data
given at the end of this article).

To strengthen the validity of these interacting 
proteins, we simultaneously generated an interaction 
network using the BioGRID database derived from 
transcriptomic data of mutation-positive 3-M syndrome
patients. Using gene expression data (Murray et al. 2013)
comparing fibroblast cells of 3-M syndrome patients
(n = 4) to age-matched normal healthy control individuals
(n = 3), we identified 913 probe sets differentially expressed
between 3-M syndrome patients and control samples
which represented 683 distinct genes (Supplementary
Table 2). The BioGRID database was used to infer an
interaction network from the 683 distinct genes resulting
in an overall 'transcriptomic network' of 3534 proteins
with 6054 connections (Supplementary Figure 1B).

We next compared the IP/MS and the transcriptomic
BioGRID networks, identifying that 141 proteins were
present in both networks, representing a significant
overlap between the two networks (hypergeometric
probability, P = 7.32 × 10^-[?]). These 141 proteins represent
the overall 3-M interactome and are proteins that
were identified in the CUL7, OBSL1 and CCDC8 IP/MS
datasets and within the network generated from genes
that are differentially expressed in 3-M syndrome. The
subsequent BioGRID network generated from the 3-M
interactome contained 131 of these proteins with 721
connections (Fig. 1A).

Pathway analysis of the 3-M interactome 

We analysed the 131 proteins from the BioGRID-derived 
network to identify the cellular pathways that are 
associated with the 3-M interactome. This pathway analysis 
showed significant over-representation of mRNA splicing/ 
processing, metabolism of proteins, cell cycle, apoptosis 
and DNA repair pathways (Tables 2 and 3). In addition,
Webgestalt analysis also identified an over-representation
of a number of signalling pathways, most notably the
insulin, IGF1, VEGF and mTOR pathways (Table 3). At an
individual protein level, we identified that ten of the 20
known major heterogeneous ribonucleoprotein (HNRNP)
complex proteins (Chaudhury et al. 2010), along with
other RNA-binding proteins and ribosomal subunit
proteins in particular, were amongst the most abundant
within the combined 3-M interactome. The network
properties including node (protein) centrality and connectivity
were used to determine community centrality of
each node within the 3-M interactome. This was assessed by
the ModuLand method to identify the nodes which best
represent the function of the overall network and revealed
15 key 3-M interactome modules (or node centres) (Fig. 1B).

Additional IPs to confirm interactions 

We next performed additional IPs in HEK293 cells overexpressing
CUL7, V5-OBSL1 and CCDC8 using specific
antibodies to either CUL7, V5 or CCDC8. In each of the
CUL7, V5-OBSL1 and CCDC8 IPs, we recovered
proteins from a number of the key pathways associated
with the network. These proteins were not present in the
'no antibody control' IPs, confirming their association
with the 3-M interactome. This includes two central nodes
identified by ModuLand, XRCC5 and CCT2. We confirmed
interactions with proteins in a number of pathways,
including mRNA splicing/processing (HNRNPU), metabolism
of proteins and protein folding (CCT2), double-strand
repair and non-homologous end-joining (XRCC5) and
cell cycle (TP53 and CDK1) (Fig. 1C).
Figure 1

The 3-M interactome. (A) Cytoscape grid layout of the 131 proteins with
721 connections between them that form the 3-M interactome. Network
was generated through identifying proteins present in both the IP/MS
network and the transcriptomic network. Physical interactions are shown
by orange connections and interactions which are both physical and
genetic shown by blue connections. Nodes are assigned and coloured
according to the central node where they most belong. (B) ModuLand
network representing the key nodes within the overall network designated
by degree of interactions and network centrality. (C) Immunoprecipitation
of V5-OBSL1-overexpressing HEK293 cells (left panel, OBSL1-V5 IP),
CUL7-overexpressing HEK293 cells (middle panel, CUL7 IP) and
CCDC8-overexpressing HEK293 cells (right panel, CCDC8 IP) with western blotting
to identify co-immunoprecipitated proteins to confirm the putative
interactions identified by IP/MS. Protein inputs (Input) and control IPs with
no antibody (no Ab IP) are shown for each panel.



CUL7, OBSL1 and CCDC8 modulate the 
alternative splicing of the INSR 

RNA splicing is the most significantly associated cellular 
pathway within the 3-M interactome and HNRNP proteins 
are amongst the most common components of this 
pathway. We have confirmed the interaction of HNRNPU
with all three 3-M proteins and also identified that
HNRNPA1 and HNRNPF are in the 3-M interactome.
Talukdar et al. (2011) have recently demonstrated that
HNRNP F, H1 and U bind to the splicing motif of intron 10
of INSR, where HNRNPA1 promotes exon 11 exclusion
and HNRNPF promotes exon 11 inclusion. The alternative
splicing of INSR gives rise to two different protein isoforms,
IR-A (- exon 11) and IR-B (+ exon 11) (Belfiore et al.
2009). To determine if CUL7, OBSL1 and CCDC8, through
their interaction with HNRNPs and other members of the
splicing machinery, also regulate alternative splicing
events, we have used an INSR minigene system to determine
the effect of the 3-M proteins on the inclusion/
exclusion of exon 11 of INSR. In fibroblast cells from
normal control individuals and those derived from 3-M
syndrome patients, we show that loss of CUL7, OBSL1 or
CCDC8 leads to a reduction in IR-A isoform and therefore
an increase in the ratio of IR-B to IR-A expression (Fig. 2A).
Conversely, overexpression of CUL7, OBSL1 or CCDC8
in HEK293 cells results in an increase in IR-A expression
and subsequent decrease in IR-B to IR-A ratio (Fig. 2B).



Discussion 

In this study, we have been able to combine experimental 
IP/MS and transcriptomic data from 3-M syndrome 
Table 2 Reactome analysis of the 3-M interactome

| Un-adjusted probability of seeing N or more genes in this event by chance | Number of genes in your query which map to this event | Total number of genes involved in this event | Name of this event | Submitted identifiers mapping to this event |
|---|---|---|---|---|
| 1.67 × 10^-[?] | 17 | 112 | mRNA splicing | SNRNP200, PTBP1, YBX1, SMC1A, HNRNPA0, HNRNPF, HNRNPH1, PRPF8, EFTUD2, DHX9, PCBP2, SRSF9, HNRNPA1, HNRNPL, HNRNPU, RBMX, HNRNPR |
| 4.37 × 10^-[?] | 17 | 136 | mRNA processing | SNRNP200, PTBP1, YBX1, SMC1A, HNRNPA0, HNRNPF, HNRNPH1, PRPF8, EFTUD2, DHX9, PCBP2, SRSF9, HNRNPA1, HNRNPL, HNRNPU, RBMX, HNRNPR |
| 3.64 × 10^-[?] | 40 | 1031 | Gene expression | SNRNP200, PTBP1, IGF2BP3, YBX1, RPS3A, ELAVL1, RPLP0, HNRNPA0, RPL18, HNRNPF, EEF1G, EEF1A1, IGF2BP1, RPL14, RPS4X, RPS2, PCBP2, RPS8, HNRNPA1, RPL10A, PABPC1, HNRNPR, EIF4A1, SF1, SMC1A, HNRNPH1, RPL11, PRPF8, RPL7A, EFTUD2, PARP1, KHSRP, PPP2R1A, DHX9, SRSF9, RPL8, HNRNPU, HNRNPL, RBMX, TRIM28 |
| 6.24 × 10^-[?] | 29 | 574 | Metabolism of proteins | EIF4A1, HSPD1, RPS3A, LMNA, CCT6A, CCT3, RPLP0, RPL18, PDIA3, RPL11, EEF1G, EEF1A1, HSP90B1, CCT2, RPL7A, HSPA5, PDIA6, CCT8, TCP1, RPL14, RPS4X, RPS2, RPL8, RPS8, HSPA9, ATP5B, RPL10A, NOP56, PABPC1 |
| 3.21 × 10^-[?] | 13 | 109 | 3'-UTR-mediated translational regulation | EIF4A1, RPL7A, RPS3A, RPL14, RPS4X, RPS2, RPL8, RPS8, RPLP0, RPL18, RPL10A, PABPC1, RPL11 |
| 5.37 × 10^-[?] | 3 | 6 | Nonhomologous end joining (NHEJ) | XRCC5, PRKDC, XRCC6 |
| 5.90 × 10^-[?] | 10 | 154 | Apoptosis | LMNB1, CAD, LMNA, TJP1, YWHAE, DSG2, YWHAQ, DSP, KPNB1, PLEC |
| 9.36 × 10^-[?] | 6 | 53 | Protein folding | CCT2, CCT8, NOP56, TCP1, CCT6A, CCT3 |
| 0.000684505 | 8 | 137 | Cell-cell communication | FLNA, ACTN4, MLLT4, JUP, KRT14, IQGAP1, KRT5, PLEC |
| 0.001024479 | 16 | 478 | Cell cycle | LMNB1, DYNC1H1, LMNA, CDK1, SMC1A, TOP2A, TP53, EMD, TPR, PPP2R1A, NUP93, MCM7, YWHAE, NUMA1, NUP205, NPM1 |
| 0.001530044 | 14 | 403 | Cell cycle, mitotic | LMNB1, EMD, TPR, DYNC1H1, PPP2R1A, LMNA, CDK1, YWHAE, SMC1A, MCM7, NUP93, NUMA1, TOP2A, NUP205 |
| 0.003057906 | 3 | 21 | Double-strand break repair | XRCC5, PRKDC, XRCC6 |
| 0.004274334 | 10 | 266 | Mitotic M-M/G1 phases | LMNB1, EMD, TPR, PPP2R1A, LMNA, CDK1, SMC1A, MCM7, NUP93, NUP205 |
| 0.016112501 | 21 | 915 | Disease | RPS3A, CDK1, RPLP0, RPL18, RPL11, KPNB1, RPL7A, TPR, PPP2R1A, RPL14, RPS2, RPS4X, NUP93, RPL8, XRCC5, RPS8, HDAC2, RPL10A, NUP205, XRCC6, NPM1 |




patients to generate a disease interactome. We have 
associated molecular pathways with this interactome to 
identify biological processes that underlie this PSS 
condition. Some of the proteins identified in this study, 
which form the 3-M interactome, are likely to be ideal 
candidate short stature genes that may be defective in 
undiagnosed 3-M syndrome or in similar PSS disorders. 
The association of molecular pathways with the 3-M 
syndrome proteins has given us further insights into the 
molecular mechanisms of growth restriction seen in this 
condition and potentially other short stature disorders. 
There are potential limitations of using an IP/MS
approach to identify the interacting partners of a
particular protein; this includes the possibility of identifying
both direct and indirect interactions. Future studies,
for example, utilising Förster resonance energy transfer
(FRET) experiments between the 3-M proteins and a 
number of the key interacting partners could determine 
whether these are direct interactions and therefore 
directly associated with the 3-M pathway. Nevertheless, 
it is clear that there is a strong association of RNA 
processing, ribosome and cell cycle pathways within the 
[Figure 2 legend keys. Panel A: control, CUL7-/-, OBSL1-/- and CCDC8-/- fibroblasts. Panel B: HEKs, HEKs+CUL7, HEKs+OBSL1 and HEKs+CCDC8. Gel bands: IR-B and IR-A.]

Figure 2

INSR minigene assay. (A) Quantification of alternative splicing of INSR
minigene in fibroblast cells. Control cells (n = 3) and fibroblasts from 3-M
syndrome patients, CUL7-/-, OBSL1-/- and CCDC8-/-, were transfected
with an INSR minigene construct and relative levels of INSR were measured
by RT-PCR analysis. Graph indicates the relative expression of IR-B/IR-A
as a mean for n = 10 transfection experiments for each cell type;
a representative gel is shown below the graph. Error bars represent s.e.m.
(B) Quantification of alternative splicing of INSR minigene in HEK293 cells.
HEK293 cells were transfected with INSR minigene construct only (labelled
HEKs, n = 8 transfection experiments) or with minigene and a CUL7
expression vector (HEKs + CUL7, n = 5 transfection experiments), with
minigene and an OBSL1 expression vector (HEKs + OBSL1, n = 5 transfection
experiments) and with minigene and a CCDC8 expression vector (HEKs +
CCDC8, n = 5 transfection experiments). Graph indicates the mean relative
expression of IR-B/IR-A for each combination of transfections as indicated;
a representative gel is shown below the graph. Error bars represent s.e.m.



CUL7, OBSL1 and CCDC8 networks. In particular, in each
of the IP/MS datasets there was a high proportion of RNA
binding/processing proteins with a highly significant
probability of enrichment in pathways associated with
either RNA processing or splicing (Supplementary Table 1),
and it is therefore likely that at least some of these would be
direct interactions. The association of RNA binding
proteins was also supported by additional IP of HNRNPU
with all three 3-M proteins (Fig. 1C).

The possibility of false-positive interactions is often 
regarded as a weakness with MS-derived data. We used a 
stringent analysis protocol in which only proteins that 
were present in all three experimental IPs but not in any of 
the three negative-control IPs were identified as potential 
interacting proteins. To further increase confidence in our 
data, we used a multi-omic approach using transcriptomic 
data from 3-M syndrome patients' fibroblast cells alongside
the IP/MS data. The common proteins within these
datasets defined the overall 3-M syndrome interactome. 
As a measure of the robustness of the analysis we applied 
to the IP/MS data, there was a high degree of overlap 
between the IP/MS and transcriptomic data with 141 of 
the 189 proteins in the IP/MS data also present in the 
transcriptomic network. 

Our data are in alignment with recent studies on the
function of the different 3-M proteins; Litterman et al.
(2011) recently demonstrated that OBSL1 is a major
component of the CUL7 SCF complex, which also includes
an F-box specificity factor, FBXW8. These IP studies
identified that five members of the T-complex protein 1
(TCP1) chaperonin complex (CCT2, CCT3, CCT6A,
CCT6B and CCT7) are putative interacting partners of
FBXW8. Supporting this observation, we also found four
members of this protein family (TCP1, CCT2, CCT3 and
CCT6A) were present in the 3-M interactome and predict
they may act as adaptor proteins within the CUL7 SCF 
complex. IP experiments from lysates of HEK293 cells 
overexpressing CUL7, V5-OBSL1 and CCDC8 confirmed 
the interaction between CCT2 and the 3-M proteins 
and CCT2 was also one of the key network nodes within 
the 3-M interactome. 

p53 is a major tumour suppressor that is vital for
maintaining normal cell growth and in particular is
central to the stress response of cells (Steele et al. 1998).
Numerous studies have identified that CUL7 interacts
with p53 and that the CUL7 SCF complex is able to
monoubiquitinate p53; however, p53 is unlikely to be a true
proteasomal degradation substrate (Andrews et al. 2006,
Kasper et al. 2006, Kaustov et al. 2007). Knockdown of
CUL7 increases p53-mediated inhibition of cell cycle
progression, while CUL7 overexpression represses p53
induction after DNA damage, suggesting that CUL7 is an
antiapoptotic oncogene (Jung et al. 2007, Kim et al.
2007). Acetylation of p53 by KAT5 (also known as Tip60)
is thought to play a role in the activation of p53 in the stress
response and induces p53-mediated apoptosis. Recently,
CCDC8 was shown to interact with both p53 and KAT5
and to be required for activation of BBC3 (also known as
PUMA) during the p53-mediated apoptotic response (Dai et al.
2011). Our IP studies support the interaction between
CUL7 and p53 and the interaction between CCDC8
and p53, while also implying that OBSL1 associates with
p53 as part of this complex.

In some PSS disorders, mutations in genes associated
with DNA damage and the cell cycle have been identified. These
include mutations in the DNA damage response kinase
ATR as a cause of Seckel syndrome and PCNT mutations,
which have been identified in MOPDII. Cell lines derived
from patients with PCNT mutations have been shown to
have disrupted signalling of the ATR-dependent DNA damage
response. CDK1 is a key regulator of the ATR signalling
pathway required for G2/M transition. It has been shown
previously that mutations in ATR, ATRIP and CEP152
associated with PSS result in loss of function of these
genes, which impairs the activity of the ATR signalling
pathway and therefore alters the G2/M checkpoint
(Klingseisen & Jackson 2011). Our 3-M interactome
identified that a number of cell cycle and DNA damage
response proteins are associated with 3-M proteins,
resulting in significant over-representation of these
pathways (Tables 2 and 3). CDK1 was also confirmed as
an interacting partner of the 3-M proteins. Consistent
with the role of CUL7, OBSL1 and CCDC8 as growth-promoting
genes, and their association with cell cycle
proteins, we have previously shown that fibroblast cells
from 3-M syndrome patients with null mutations in the
3-M genes have a significantly reduced level of cell
proliferation compared with normal control fibroblast
cells (Murray et al. 2013). Our analysis of the 3-M
interactome identified that the DNA damage response
protein XRCC5 was also one of the key central network
nodes (Fig. 1B). The role the 3-M proteins have in
XRCC5 function and the DNA damage response is not
characterised; however, there is evidence that elevated
expression of CUL7 is associated with cancer progression
and poor survival (Kim et al. 2007).
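
Pathway over-representation of the kind referred to above is commonly assessed with a one-sided hypergeometric test. This section does not state which test or software was used, so the following is a generic sketch under that assumption. The function name `overrepresentation_p` and all counts are hypothetical and are not taken from the study.

```python
from math import comb


def overrepresentation_p(background: int, pathway: int, sample: int, hits: int) -> float:
    """P(X >= hits) for X ~ Hypergeometric(background, pathway, sample).

    background: number of proteins in the reference set
    pathway:    number of reference proteins annotated to the pathway
    sample:     number of proteins in the network being tested
    hits:       number of network proteins annotated to the pathway
    """
    upper = min(pathway, sample)
    total = comb(background, sample)
    # Sum the probability mass from the observed overlap up to the maximum.
    tail = sum(
        comb(pathway, k) * comb(background - pathway, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p_value = overrepresentation_p(background=20000, pathway=200, sample=100, hits=10)
print(f"{p_value:.3g}")
```

A small p-value indicates that the pathway contains more network proteins than expected by chance. In practice, a multiple-testing correction would be applied across all pathways tested.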

The most significantly associated pathways in the 3-M
interactome are those that are involved in the regulation
of mRNA splicing. Mutations in splicing proteins have
previously been associated with primordial dwarfism; for
example, mutations in RNU4ATAC cause MOPDI (Nagy et al.
2012). We have shown that overexpression of CUL7,
OBSL1 and CCDC8 results in an increase in IR-A
expression in HEK293 cells as a result of increased levels of
exon 11 exclusion in a minigene system. We also found
that 3-M patient fibroblast cell models with knockout of
CUL7, OBSL1 or CCDC8 show a reduction in IR-A expression
from the INSR minigene.

Figure 3

The CUL7-OBSL1-CCDC8 pathway and its predicted role in cell growth.
OBSL1 interacts with both CUL7 and CCDC8 (solid connections show
protein-protein interactions), and all three associate with the mRNA splicing
machinery, with a particularly high abundance of HNRNPs in the 3-M
interactome. Alternative splicing of the insulin receptor (INSR) is
modulated by CUL7, OBSL1 and CCDC8. IRS-1 is also a target of the CUL7 E3
ubiquitin ligase, and this impacts on downstream signalling upon growth
factor stimulation, leading to dysfunction in MAPK and AKT activation.
This subsequently results in a reduction of cell proliferation in cells derived
from 3-M syndrome patients. Figure labels: growth factors; insulin
receptor; splicing machinery components targeted for degradation by the
CUL7 E3 ligase.

IR-A predominantly mediates the mitogenic activity
of insulin, whereas IR-B predominantly mediates the
metabolic effects (Belfiore et al. 2009). Furthermore, IR-A
is associated with increased proliferative rates, and
elevated IR-A is found in both foetal and cancer tissues
(Belfiore et al. 2009). The insulin and IGF1 pathways are
amongst the pathways most commonly associated with
the 3-M interactome (Table 3). We have previously
demonstrated that 3-M syndrome patients show defective
phosphorylation of AKT and MAPK upon growth factor
stimulation, and clinically there is a suggestion that 3-M
patients have a degree of GH and/or IGF1 resistance
(Hanson et al. 2012). IRS-1 is an important adaptor
molecule downstream of the insulin, IGF1 and GH
receptors, and it has also been shown to be a target of the
CUL7 SCF complex, resulting in the dysfunction of AKT
and MAPK signalling cascades (Xu et al. 2008).

Although preliminary, these studies suggest that the
3-M proteins themselves could be involved in the modulation
of alternative splicing of INSR. However, in light of
the already known abnormalities within the IGF system, it
remains to be established whether the proposed modulation
of INSR splicing has any direct impact on the growth
failure seen in 3-M syndrome patients. Future studies could
determine whether the association of 3-M proteins with
components of the major splicing pathways has a more
global effect on alternative splicing events, in particular
on other pathways identified in the 3-M interactome,
and whether this may contribute to the pathology.

3-M syndrome patients are typically born small for
gestational age as a result of foetal growth restriction.
Our previously published transcriptomic data from 3-M
syndrome patients with null mutations in either CUL7,
OBSL1 or CCDC8 revealed that IGF2 expression is
significantly reduced (Murray et al. 2013). The 3-M
interactome data suggest that this could be facilitated by
the direct interaction that we have identified with both
IGF2BP1 and IGF2BP3, which are known to interact with
the IGF2 5' UTR. SRS is clinically similar to 3-M syndrome
and has been associated with epigenetic alterations of
the IGF2/H19 locus resulting in the loss of IGF2 expression
(Eggermann 2010). Our association of the 3-M
syndrome proteins with this pathway may suggest that
defects in the IGF system underlie these phenotypically
similar PSS conditions.



Conclusion 

Our multi-omic approach, alongside previous studies, has
identified a strong association of mRNA splicing, ubiquitination
and the IGF pathway with the function of the
CUL7/OBSL1/CCDC8 complex. We have also identified an association
with cell cycle and DNA damage response pathways,
which are also found to be defective in numerous other PSS
disorders, suggesting that their normal function is vital for postnatal
growth. We postulate that the interactions of the 3-M proteins
we have identified may link the disruption of CUL7 SCF
substrate ubiquitination, and the subsequent accumulation of
these substrates in 3-M syndrome, to alteration of major splicing events.
This may in turn lead to dysfunction of growth factor signalling,
resulting in growth restriction via altered cell cycle
progression and DNA damage response (Fig. 3).